Align target sequence to selected transcript on protein level by sallybg · Pull Request #79 · VariantEffect/dcd_mapping2

sallybg · 2026-03-11T17:38:13Z

For protein-coding, non-reference variants, align protein-level target sequence to protein-level selected transcript sequence, and use alignment information to generate mappings. This handles within-sequence offsets between the target and selected transcript, which was not handled previously.
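As a rough illustration of why a single start offset is not enough, the sketch below derives a per-residue position map from matching blocks. It uses `difflib.SequenceMatcher` from the standard library as a stand-in aligner; this is an assumption made for illustration and is not how `align_target_to_protein` in this PR is implemented. The sequences and the `build_position_map` helper are hypothetical.

```python
from difflib import SequenceMatcher


def build_position_map(target: str, transcript: str) -> dict[int, int]:
    """Map 0-based target residue indices to 0-based transcript indices.

    Hypothetical helper: residues that fall outside any matching block
    (e.g. insertions in the target) are simply left out of the map.
    """
    matcher = SequenceMatcher(None, target, transcript, autojunk=False)
    position_map: dict[int, int] = {}
    for block in matcher.get_matching_blocks():
        for i in range(block.size):
            position_map[block.a + i] = block.b + i
    return position_map


# Hypothetical sequences: the target lacks the leading "M" of the transcript
# and carries a two-residue insertion ("GG") in the middle.
transcript = "MKTAYIAKQR"
target = "KTAYGGIAKQR"

pos_map = build_position_map(target, transcript)

# The offset before the insertion (+1) differs from the offset after it (-1),
# so one offset taken from the start of the sequence cannot describe both.
offset_before = pos_map[0] - 0
offset_after = pos_map[6] - 6
```

A real protein aligner would score substitutions and gaps rather than require exact matching blocks, but the idea of looking up each position through the alignment is the same.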

bencap

Thanks Sally, looks like a really nice improvement to the mapping logic and it's resolving the cases we identified soundly. There's just one potential issue we should resolve before merging.

bencap · 2026-03-12T16:11:51Z

src/dcd_mapping/vrs_map.py

    For protein annotations, these strings must be adjusted to match the offset defined by the start of the
    transcript sequence. For genomic annotations, these strings must be adjusted to match the coordinates of


We should adjust this documentation to match the new logic of aligned offsets.

bencap · 2026-03-12T16:26:41Z

src/dcd_mapping/vrs_map.py

+    protein_align_result = align_target_to_protein(
+        sequence, transcript.sequence, silent
+    )


We should add an isinstance type guard here, like the isinstance(transcript, TxSelectError) check we already do before accessing any properties of transcript in the loop below. If we access transcript.sequence while transcript is a TxSelectError, we'll get an AttributeError.

I think it's enough to guard the call and then continue on to the loop if transcript is of the wrong type, so that we still get per-row handling of error messages.

Suggested change

-    protein_align_result = align_target_to_protein(
-        sequence, transcript.sequence, silent
-    )
+    protein_align_result = None
+    if isinstance(transcript, TxSelectResult):
+        protein_align_result = align_target_to_protein(
+            sequence, transcript.sequence, silent
+        )
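For anyone following along, here is a minimal, self-contained sketch of the failure mode and the guard. The `TxSelectResult` and `TxSelectError` classes and the `align_target_to_protein` function below are simplified stand-ins written for this example; the real definitions in dcd_mapping have different fields and behavior, and the `safe_align` wrapper is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TxSelectResult:  # stand-in, not the real dcd_mapping class
    sequence: str


@dataclass
class TxSelectError:  # stand-in; note it has no `sequence` attribute
    message: str


def align_target_to_protein(
    target: str, transcript_sequence: str, silent: bool = True
) -> Optional[int]:
    """Placeholder aligner: return the start offset of an exact match, else None."""
    idx = transcript_sequence.find(target)
    return idx if idx >= 0 else None


def safe_align(
    sequence: str, transcript: Union[TxSelectResult, TxSelectError], silent: bool = True
) -> Optional[int]:
    # Only touch transcript.sequence once we know the selection succeeded.
    protein_align_result = None
    if isinstance(transcript, TxSelectResult):
        protein_align_result = align_target_to_protein(
            sequence, transcript.sequence, silent
        )
    return protein_align_result


ok = safe_align("KTAY", TxSelectResult(sequence="MKTAY"))
failed = safe_align("KTAY", TxSelectError(message="no transcript found"))

# Without the guard, the same access raises AttributeError.
try:
    TxSelectError(message="no transcript found").sequence
    unguarded_raises = False
except AttributeError:
    unguarded_raises = True
```

With the guard in place, the error case falls through with `protein_align_result = None`, leaving the per-row loop to report the problem.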

bencap · 2026-03-12T16:29:31Z

src/dcd_mapping/vrs_map.py

@@ -767,7 +814,7 @@
        else:
            if _hgvs_pro_is_valid(row.hgvs_pro):


Given the preceding comment, I think we'd also want to guard against a NoneType protein_align_result here. I'm not convinced that can actually occur in practice (in this code base things are often typed as Unions, but the types are heavily correlated, such that if one object is type X, another must be type Y and not Z), but it's a good guard to have.

I think transcript is fine typed as is, given my point about correlated types and since it pre-exists this change.
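A sketch of what that per-row guard could look like, assuming a hypothetical `map_protein_row` helper and a dictionary-shaped result; none of these names come from vrs_map.py, and the integer offset is only a placeholder for whatever the alignment result actually carries.

```python
from typing import Optional


def map_protein_row(hgvs_pro: str, protein_align_result: Optional[int]) -> dict:
    """Hypothetical per-row mapper that refuses to proceed without an alignment."""
    if protein_align_result is None:
        # Record the problem on this row instead of raising, so other rows still map.
        return {
            "hgvs_pro": hgvs_pro,
            "mapped": False,
            "error_message": "No protein alignment available for this target",
        }
    return {
        "hgvs_pro": hgvs_pro,
        "mapped": True,
        "offset": protein_align_result,  # placeholder for alignment-derived info
        "error_message": None,
    }


rows = ["p.Ala2Gly", "p.Lys5Arg"]
with_alignment = [map_protein_row(r, 1) for r in rows]
without_alignment = [map_protein_row(r, None) for r in rows]
```

Even if the None case never occurs in practice, the explicit check keeps the failure local to the row and gives the type checker something to narrow on.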

sallybg added 2 commits March 11, 2026 10:01

Align target sequence to selected transcript on protein level

f6be22d

Upgrade pandas

bd14182

sallybg requested a review from bencap March 11, 2026 17:38

bencap linked an issue Mar 11, 2026 that may be closed by this pull request

Align target sequence to selected transcript at the protein level #80

Open

bencap requested changes Mar 12, 2026

bencap changed the base branch from mavedb-main to mavedb-dev March 12, 2026 17:36

sallybg wants to merge 2 commits into mavedb-dev from align-target-to-protein
